Original article Assessment of community-submitted ontology annotations from a novel database-journal partnership
Abstract
As the scientific literature grows, leading to an increasing volume of published experimental data, so does the need to access and analyze this data using computational tools. The most commonly used method to convert published experimental data on gene function into controlled vocabulary annotations relies on a professional curator, employed by a model organism database or a more general resource such as UniProt, to read published articles and compose annotation statements based on the articles’ contents. A more cost-effective and scalable approach capable of capturing gene function data across the whole range of biological research organisms in computable form is urgently needed. We have analyzed a set of ontology annotations generated through collaborations between the Arabidopsis Information Resource and several plant science journals. Analysis of the submissions entered using the online submission tool shows that most community annotations were well supported and the ontology terms chosen were at an appropriate level of specificity. Of the 503 individual annotations that were submitted, 97% were approved and community submissions captured 72% of all possible annotations. This new method for capturing experimental results in a computable form provides a cost-effective way to greatly increase the available body of annotations without sacrificing annotation quality.
Related articles
Original article Tetrahymena genome database Wiki: a community-maintained model organism database
When funding for Tetrahymena Genome Database (TGD) ended in 2006, no further updates were made to this important community resource and the main database was taken offline in 2008. We have restored and updated this important resource for use by the Tetrahymena research community. We have also retooled the TGD website (now TGD Wiki) to allow members of the community to directly update the inform...
The use of gene ontology evidence codes in preventing classifier assessment bias
Motivation: The biological community's reliance on computational annotations of protein function makes correct assessment of function prediction methods an issue of great importance. The fact that a large fraction of the annotations in current biological databases are based on computational methods can lead to bias in estimating the accuracy of function prediction methods. This can happen since ...
The Confidence Information Ontology: a step towards a standard for asserting confidence in annotations
Biocuration has become a cornerstone for analyses in biology, and to meet needs, the amount of annotations has considerably grown in recent years. However, the reliability of these annotations varies; it has thus become necessary to be able to assess the confidence in annotations. Although several resources already provide confidence information about the annotations that they produce, a standa...
SUBMITTED AS AN APPLICATION NOTE Title: GOChase: Correcting Errors from Gene Ontology-based Annotations for Gene Products
Summary: The Gene Ontology (GO) is a controlled biological vocabulary that provides three structured networks of terms to describe biological processes, cellular components, and molecular functions. Many databases of gene products are annotated using the GO vocabularies. We found that some GO-updating operations are not easily traceable by the current biological databases and GO browsers. Conse...